AITopics | pr description

Collaborating Authors

pr description

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sleep-time Compute: Beyond Inference Scaling at Test-time

Lin, Kevin, Snell, Charlie, Wang, Yu, Packer, Charles, Wooders, Sarah, Stoica, Ion, Gonzalez, Joseph E.

arXiv.org Artificial IntelligenceApr-18-2025

Scaling test-time compute has emerged as a key ingredient for enabling large language models (LLMs) to solve difficult problems, but comes with high latency and inference cost. We introduce sleep-time compute, which allows models to "think" offline about contexts before queries are presented: by anticipating what queries users might ask and pre-computing useful quantities, we can significantly reduce the compute requirements at test-time. To demonstrate the efficacy of our method, we create modified versions of two reasoning tasks - Stateful GSM-Symbolic and Stateful AIME. We find that sleep-time compute can reduce the amount of test-time compute needed to achieve the same accuracy by ~ 5x on Stateful GSM-Symbolic and Stateful AIME and that by scaling sleep-time compute we can further increase accuracy by up to 13% on Stateful GSM-Symbolic and 18% on Stateful AIME. Furthermore, we introduce Multi-Query GSM-Symbolic, which extends GSM-Symbolic by including multiple related queries per context. By amortizing sleep-time compute across related queries about the same context using Multi-Query GSM-Symbolic, we can decrease the average cost per query by 2.5x. We then conduct additional analysis to understand when sleep-time compute is most effective, finding the predictability of the user query to be well correlated with the efficacy of sleep-time compute. Finally, we conduct a case-study of applying sleep-time compute to a realistic agentic SWE task.

compute, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2504.13171

Country: North America > United States (0.28)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Interactive Agents to Overcome Ambiguity in Software Engineering

Vijayvargiya, Sanidhya, Zhou, Xuhui, Yerukola, Akhila, Sap, Maarten, Neubig, Graham

arXiv.org Artificial IntelligenceFeb-18-2025

AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes, safety risks due to tool misuse, and wasted computational resources. In this work, we study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating proprietary and open-weight models on their performance across three key steps: (a) leveraging interactivity to improve performance in ambiguous scenarios, (b) detecting ambiguity, and (c) asking targeted questions. Our findings reveal that models struggle to distinguish between well-specified and underspecified instructions. However, when models interact for underspecified inputs, they effectively obtain vital information from the user, leading to significant improvements in performance and underscoring the value of effective interaction. Our study highlights critical gaps in how current state-of-the-art models handle ambiguity in complex software engineering tasks and structures the evaluation into distinct steps to enable targeted improvements.

information, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.13069

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Add feedback

Automatic Pull Request Description Generation Using LLMs: A T5 Model Approach

Sakib, Md Nazmus, Islam, Md Athikul, Arifin, Md Mashrur

arXiv.org Artificial IntelligenceAug-1-2024

Developers create pull request (PR) descriptions to provide an overview of their changes and explain the motivations behind them. These descriptions help reviewers and fellow developers quickly understand the updates. Despite their importance, some developers omit these descriptions. To tackle this problem, we propose an automated method for generating PR descriptions based on commit messages and source code comments. This method frames the task as a text summarization problem, for which we utilized the T5 text-to-text transfer model. We fine-tuned a pre-trained T5 model using a dataset containing 33,466 PRs. The model's effectiveness was assessed using ROUGE metrics, which are recognized for their strong alignment with human evaluations. Our findings reveal that the T5 model significantly outperforms LexRank, which served as our baseline for comparison.

commit message, dataset, pr description, (13 more...)

arXiv.org Artificial Intelligence

2408.00921

Country:

North America > United States > Idaho > Ada County > Boise (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(2 more...)

Genre:

Overview (0.86)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback